knitr document van Steensel lab

TF reporter cDNA-count processing - K562

Introduction

I previously processed the raw sequencing data, optimized the barcode clustering, quantified the pDNA data and normalized the cDNA data. In this script, I take a closer look at the normalized cDNA data as a whole.

Analysis

First insights into data distribution - reporter activity distribution plots

Heat map - display mean log2-activity for each TF in each condition
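
The original chunk for this heatmap is not shown, so the following is only a minimal sketch of how a mean log2-activity matrix (TF by condition) could be computed and drawn. The data frame `cDNA_df`, its column names and all values are hypothetical.

```r
# Hypothetical example data: one row per reporter measurement.
set.seed(1)
cDNA_df <- data.frame(
  tf        = rep(c("TF_A", "TF_B", "TF_C"), each = 8),
  condition = rep(rep(c("cond_1", "cond_2"), each = 4), times = 3),
  activity  = rexp(24, rate = 1) + 0.1,
  stringsAsFactors = FALSE
)

# log2-transform first, then average within each TF x condition cell.
cDNA_df$log2_activity <- log2(cDNA_df$activity)
mean_mat <- with(cDNA_df, tapply(log2_activity, list(tf, condition), mean))

# Base-R heatmap without clustering or scaling. With the pheatmap package
# from the session info, pheatmap::pheatmap(mean_mat) would be a comparable call.
heatmap(mean_mat, Rowv = NA, Colv = NA, scale = "none",
        xlab = "condition", ylab = "TF")
```

Averaging after the log2 transform gives the mean log2-activity named in the heading; averaging before the transform would give a different quantity.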

Heatmap for native enhancers

Run FIMO script again

# motfn=/home/f.comoglio/mydata/Annotations/TFDB/Curated_Natoli/update_2017/20170320_pwms_selected.meme
# odir=/home/m.trauernicht/mydata/projects/tf_activity_reporter/data/SuRE_TF_1/results/native-enhancer/fimo
# query=/home/m.trauernicht/mydata/projects/tf_activity_reporter/data/SuRE_TF_1/results/native-enhancer/cDNA_df_native.fasta

# nice -n 19 fimo --no-qvalue --thresh 1e-4 --verbosity 1 --o $odir $motfn $query 

Load FIMO results

We built a TF motif matrix using -log10 transformed FIMO scores. We used this feature encoding throughout the rest of this analysis, unless otherwise stated.
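
As an illustration of this feature encoding, here is a minimal sketch that turns a table of FIMO hits into a sequence-by-motif matrix. It assumes that "FIMO scores" refers to the FIMO p-values, that the strongest hit per sequence and motif is kept, and that missing hits are encoded as 0. None of these choices is confirmed by the text above, and the example hits are made up.

```r
# Hypothetical FIMO hits. The columns follow the FIMO TSV output
# (motif_id, sequence_name, p-value); the values are invented.
fimo_df <- data.frame(
  motif_id      = c("M1", "M1", "M2", "M2", "M3"),
  sequence_name = c("seq_1", "seq_1", "seq_1", "seq_2", "seq_3"),
  p_value       = c(1e-5, 1e-7, 1e-4, 1e-6, 1e-5),
  stringsAsFactors = FALSE
)

# -log10 transform: smaller p-values become larger feature values.
fimo_df$score <- -log10(fimo_df$p_value)

# Assumption: keep the strongest hit per sequence x motif combination.
motif_mat <- with(fimo_df, tapply(score, list(sequence_name, motif_id), max))

# Assumption: a motif without a hit below the FIMO threshold is encoded as 0.
motif_mat[is.na(motif_mat)] <- 0
motif_mat
```

With the `--thresh 1e-4` setting used in the FIMO call, every reported hit would have a -log10 value of at least 4, so 0 cleanly separates "no hit" from any real match.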

Visualize FIMO results

Look at only expressed TFs in mESCs

Filter expressed TFs

Use FIMO matrix to build log-linear model

Binary presence of motif to explain expression variance
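
A minimal sketch of this idea, assuming that "log-linear" means an ordinary linear model fitted to log2-transformed activity and that a motif counts as present when its FIMO-derived value is above 0. The motif matrix and activities below are simulated, not taken from the data.

```r
set.seed(42)
n <- 60

# Hypothetical -log10 FIMO matrix: 0 means the motif was not found.
motif_mat <- matrix(
  ifelse(runif(n * 3) < 0.5, 0, runif(n * 3, min = 4, max = 8)),
  nrow = n, dimnames = list(NULL, c("M1", "M2", "M3"))
)

# Binarise: 1 if the motif is present in the sequence, 0 otherwise.
motif_bin <- as.data.frame((motif_mat > 0) * 1)

# Simulated log2 activity in which only M1 has an effect.
motif_bin$log2_activity <- 1.5 * motif_bin$M1 + rnorm(n, sd = 0.5)

# Linear model on the log scale with one binary predictor per motif.
fit <- lm(log2_activity ~ M1 + M2 + M3, data = motif_bin)

# Fraction of variance explained, and its sequential split across motifs.
r2 <- summary(fit)$r.squared
anova(fit)
```

The `anova()` table splits the explained variance sequentially, so the share attributed to each motif depends on the order of the terms when motifs co-occur.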

Heatmap per TF - comparing design activities mutated vs. non-mutated

Heatmap per TF - only WT TF activities

Compute activity changes relative to their negative controls
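
One way to express this step is as a log2 fold change of each reporter over the mean of the negative controls for the same TF. The sketch below assumes exactly that; the `type` labels, column names and values are hypothetical.

```r
# Hypothetical reporter table with wild-type and negative-control reporters.
reporter_df <- data.frame(
  tf       = c("TF_A", "TF_A", "TF_A", "TF_B", "TF_B", "TF_B"),
  type     = c("wt", "wt", "neg_ctrl", "wt", "neg_ctrl", "neg_ctrl"),
  activity = c(8, 4, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)
reporter_df$log2_activity <- log2(reporter_df$activity)

# Mean log2 activity of the negative controls, per TF.
neg <- reporter_df[reporter_df$type == "neg_ctrl", ]
neg_mean <- tapply(neg$log2_activity, neg$tf, mean)

# Subtracting on the log2 scale gives a log2 fold change over the control.
wt <- reporter_df[reporter_df$type == "wt", ]
wt$log2_fc <- wt$log2_activity - as.numeric(neg_mean[wt$tf])
wt
```

A value of 0 means the reporter is no more active than its negative control, which makes reporters for different TFs comparable on one scale.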

Taken together, these heatmaps indicate that we have informative reporters for ~10 TFs, and that the TF reporter design matters for some but not all TFs.

SuperPlot of TF activity per condition - this way we can plot not only the mean, but the complete data distribution across technical and biological replicates
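
The plotting chunk itself is not shown, so here is a minimal base-R sketch of the SuperPlot idea: every technical replicate is drawn as a small point, and the mean of each biological replicate is overlaid as a large point in the same colour. The data are simulated and all names are hypothetical; the original analysis may well have used ggplot2 and ggbeeswarm instead.

```r
set.seed(7)

# Hypothetical data: 10 technical replicates x 3 biological replicates x 2 conditions.
sp_df <- expand.grid(
  tech_rep  = 1:10,
  bio_rep   = c("rep_1", "rep_2", "rep_3"),
  condition = c("cond_1", "cond_2"),
  stringsAsFactors = FALSE
)
sp_df$log2_activity <- rnorm(
  nrow(sp_df),
  mean = ifelse(sp_df$condition == "cond_2", 1, 0),
  sd   = 0.4
)

# One summary value per biological replicate and condition.
rep_means <- aggregate(log2_activity ~ bio_rep + condition, data = sp_df, FUN = mean)

# Small transparent points: individual measurements, coloured by biological replicate.
cols <- c(rep_1 = "steelblue", rep_2 = "darkorange", rep_3 = "forestgreen")
x <- as.numeric(factor(sp_df$condition))
plot(jitter(x, amount = 0.15), sp_df$log2_activity,
     col = adjustcolor(cols[sp_df$bio_rep], alpha.f = 0.4), pch = 16,
     xaxt = "n", xlim = c(0.5, 2.5),
     xlab = "condition", ylab = "log2 activity")
axis(1, at = 1:2, labels = levels(factor(sp_df$condition)))

# Large filled points: the mean of each biological replicate.
points(as.numeric(factor(rep_means$condition)), rep_means$log2_activity,
       bg = cols[rep_means$bio_rep], pch = 21, cex = 2)
```

Any statistics would then be computed on `rep_means` (one value per biological replicate) rather than on the individual technical replicates.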

SuperPlots comparing different designs

Log-linear expression modelling to explain variance - model for each TF
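
A minimal sketch of fitting one model per TF, assuming again that the model is a linear model on log2 activity with reporter design features as predictors. The design features `spacing` and `promoter`, their levels and the simulated effects are hypothetical placeholders.

```r
set.seed(3)

# Hypothetical reporter designs: two design features, five replicates each.
design_df <- expand.grid(
  tf        = c("TF_A", "TF_B"),
  spacing   = c("short", "long"),
  promoter  = c("minP", "other"),
  replicate = 1:5,
  stringsAsFactors = FALSE
)

# Simulated log2 activity: spacing matters for TF_A only.
design_df$log2_activity <-
  ifelse(design_df$tf == "TF_A" & design_df$spacing == "long", 2, 0) +
  rnorm(nrow(design_df), sd = 0.5)

# Fit a separate linear model for each TF.
fits <- lapply(split(design_df, design_df$tf), function(d) {
  lm(log2_activity ~ spacing + promoter, data = d)
})

# Variance explained by the design features, per TF.
r2 <- vapply(fits, function(f) summary(f)$r.squared, numeric(1))
r2
```

Comparing the per-TF R-squared values is one way to see for which TFs the reporter design matters; in this simulated example that should be TF_A.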

Can expression variance be explained by the TF properties?

Session Info

paste("Run time: ",format(Sys.time()-StartTime))
## [1] "Run time:  56.5008 secs"
getwd()
## [1] "/DATA/usr/m.trauernicht/projects/SuRE-TF/gen-1_K562"
date()
## [1] "Fri Mar 12 13:45:52 2021"
sessionInfo()
## R version 3.6.3 (2020-02-29)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.7 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] tidyr_1.0.0        stringr_1.4.0      readr_1.3.1        GGally_1.5.0      
##  [5] gridExtra_2.3      cowplot_1.0.0      plyr_1.8.6         viridis_0.5.1     
##  [9] viridisLite_0.3.0  ggforce_0.3.1      ggbeeswarm_0.6.0   ggpubr_0.2.5      
## [13] magrittr_1.5       pheatmap_1.0.12    tibble_3.0.1       maditr_0.6.3      
## [17] dplyr_0.8.5        ggplot2_3.3.0      RColorBrewer_1.1-2
## 
## loaded via a namespace (and not attached):
##  [1] prettydoc_0.4.0   beeswarm_0.2.3    tidyselect_1.1.0  xfun_0.19        
##  [5] purrr_0.3.3       lattice_0.20-38   splines_3.6.3     colorspace_1.4-1 
##  [9] vctrs_0.2.4       htmltools_0.5.0   mgcv_1.8-31       yaml_2.2.1       
## [13] rlang_0.4.8       pillar_1.4.3      glue_1.4.2        withr_2.1.2      
## [17] tweenr_1.0.1      lifecycle_0.2.0   munsell_0.5.0     ggsignif_0.6.0   
## [21] gtable_0.3.0      evaluate_0.14     labeling_0.3      knitr_1.30       
## [25] vipor_0.4.5       Rcpp_1.0.5        scales_1.1.0      farver_2.0.1     
## [29] hms_0.5.3         digest_0.6.27     stringi_1.5.3     polyclip_1.10-0  
## [33] grid_3.6.3        tools_3.6.3       crayon_1.3.4      pkgconfig_2.0.3  
## [37] Matrix_1.2-18     ellipsis_0.3.0    MASS_7.3-51.5     data.table_1.12.8
## [41] assertthat_0.2.1  rmarkdown_2.5     reshape_0.8.8     R6_2.5.0         
## [45] nlme_3.1-143      compiler_3.6.3